{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Xata\n",
    "\n",
    ">[Xata](https://xata.io) is a serverless data platform, based on `PostgreSQL` and `Elasticsearch`. It provides a Python SDK for interacting with your database, and a UI for managing your data. With the `XataChatMessageHistory` class, you can use Xata databases for longer-term persistence of chat sessions.\n",
    "\n",
    "This notebook covers:\n",
    "\n",
    "* A simple example showing what `XataChatMessageHistory` does.\n",
    "* A more complex example using a REACT agent that answer questions based on a knowledge based or documentation (stored in Xata as a vector store) and also having a long-term searchable history of its past messages (stored in Xata as a memory store)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Setup\n",
    "\n",
    "### Create a database\n",
    "\n",
    "In the [Xata UI](https://app.xata.io) create a new database. You can name it whatever you want, in this notepad we'll use `langchain`. The Langchain integration can auto-create the table used for storying the memory, and this is what we'll use in this example. If you want to pre-create the table, ensure it has the right schema and set `create_table` to `False` when creating the class. Pre-creating the table saves one round-trip to the database during each session initialization."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's first install our dependencies:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%pip install --upgrade --quiet  xata langchain-openai langchain langchain-community"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Next, we need to get the environment variables for Xata. You can create a new API key by visiting your [account settings](https://app.xata.io/settings). To find the database URL, go to the Settings page of the database that you have created. The database URL should look something like this: `https://demo-uni3q8.eu-west-1.xata.sh/db/langchain`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import getpass\n",
    "\n",
    "api_key = getpass.getpass(\"Xata API key: \")\n",
    "db_url = input(\"Xata database URL (copy it from your DB settings):\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Create a simple memory store\n",
    "\n",
    "To test the memory store functionality in isolation, let's use the following code snippet:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain_community.chat_message_histories import XataChatMessageHistory\n",
    "\n",
    "history = XataChatMessageHistory(\n",
    "    session_id=\"session-1\", api_key=api_key, db_url=db_url, table_name=\"memory\"\n",
    ")\n",
    "\n",
    "history.add_user_message(\"hi!\")\n",
    "\n",
    "history.add_ai_message(\"whats up?\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The above code creates a session with the ID `session-1` and stores two messages in it. After running the above, if you visit the Xata UI, you should see a table named `memory` and the two messages added to it.\n",
    "\n",
    "You can retrieve the message history for a particular session with the following code:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "history.messages"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Conversational Q&A chain on your data with memory\n",
    "\n",
    "Let's now see a more complex example in which we combine OpenAI, the Xata Vector Store integration, and the Xata memory store integration to create a Q&A chat bot on your data, with follow-up questions and history."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We're going to need to access the OpenAI API, so let's configure the API key:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "\n",
    "os.environ[\"OPENAI_API_KEY\"] = getpass.getpass(\"OpenAI API Key:\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To store the documents that the chatbot will search for answers, add a table named `docs` to your `langchain` database using the Xata UI, and add the following columns:\n",
    "\n",
    "* `content` of type \"Text\". This is used to store the `Document.pageContent` values.\n",
    "* `embedding` of type \"Vector\". Use the dimension used by the model you plan to use. In this notebook we use OpenAI embeddings, which have 1536 dimensions."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's create the vector store and add some sample docs to it:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain_community.vectorstores.xata import XataVectorStore\n",
    "from langchain_openai import OpenAIEmbeddings\n",
    "\n",
    "embeddings = OpenAIEmbeddings()\n",
    "\n",
    "texts = [\n",
    "    \"Xata is a Serverless Data platform based on PostgreSQL\",\n",
    "    \"Xata offers a built-in vector type that can be used to store and query vectors\",\n",
    "    \"Xata includes similarity search\",\n",
    "]\n",
    "\n",
    "vector_store = XataVectorStore.from_texts(\n",
    "    texts, embeddings, api_key=api_key, db_url=db_url, table_name=\"docs\"\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "After running the above command, if you go to the Xata UI, you should see the documents loaded together with their embeddings in the `docs` table."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's now create a ConversationBufferMemory to store the chat messages from both the user and the AI."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from uuid import uuid4\n",
    "\n",
    "from langchain.memory import ConversationBufferMemory\n",
    "\n",
    "chat_memory = XataChatMessageHistory(\n",
    "    session_id=str(uuid4()),  # needs to be unique per user session\n",
    "    api_key=api_key,\n",
    "    db_url=db_url,\n",
    "    table_name=\"memory\",\n",
    ")\n",
    "memory = ConversationBufferMemory(\n",
    "    memory_key=\"chat_history\", chat_memory=chat_memory, return_messages=True\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now it's time to create an Agent to use both the vector store and the chat memory together."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain.agents import AgentType, initialize_agent\n",
    "from langchain.agents.agent_toolkits import create_retriever_tool\n",
    "from langchain_openai import ChatOpenAI\n",
    "\n",
    "tool = create_retriever_tool(\n",
    "    vector_store.as_retriever(),\n",
    "    \"search_docs\",\n",
    "    \"Searches and returns documents from the Xata manual. Useful when you need to answer questions about Xata.\",\n",
    ")\n",
    "tools = [tool]\n",
    "\n",
    "llm = ChatOpenAI(temperature=0)\n",
    "\n",
    "agent = initialize_agent(\n",
    "    tools,\n",
    "    llm,\n",
    "    agent=AgentType.CHAT_CONVERSATIONAL_REACT_DESCRIPTION,\n",
    "    verbose=True,\n",
    "    memory=memory,\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To test, let's tell the agent our name:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "agent.run(input=\"My name is bob\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now, let's now ask the agent some questions about Xata:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "agent.run(input=\"What is xata?\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Notice that it answers based on the data stored in the document store. And now, let's ask a follow up question:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "agent.run(input=\"Does it support similarity search?\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "And now let's test its memory:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "agent.run(input=\"Did I tell you my name? What is it?\")"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.12"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
